How Could Veins Speed Up The Process Of Discourse Parsing

Elena Mitocariu1, Daniel-Alexandru Anechitei1, Dan Cristea1,2
1 “Alexandru Ioan Cuza” University of Iasi, Faculty of Computer Science, 16, General Berthelot St., 700483 Iasi, Romania
2 Romanian Academy, Institute for Computer Science
{elena.mitocariu, daniel.anechitei, dcristea}@info.uaic.ro

Abstract
In this paper we propose a method of reducing the search space of a discourse parsing process, while keeping unaffected its capacity to generate cohesive and coherent tree structures. The parsing method uses Veins Theory (VT): it incrementally develops a forest of parallel discourse trees, evaluates them on cohesion and coherence criteria and keeps only the most promising structures to go on with at each step. The incremental development is constrained by two general principles, well known in discourse parsing: sequentiality of the terminal nodes and attachment restricted to the right frontier. A set of formulas rooted in VT helps to guess the most promising nodes of the right frontier where an attachment can be made, thus avoiding an exhaustive generation of the whole search space and at the same time maximizing the coherence of the discourse structures. We report good results of applying this approach, which brings a significant improvement in the discourse parsing process.

Keywords: veins theory, discourse structure, incremental discourse parsing, reduction of the execution time

1 Introduction

Discourse parsing has traditionally dealt with short texts such as newspaper articles, but new approaches, such as analysis of lexical repetition (Boguraev and Neff, 2000), identification of topics (Utiyama and Isahara, 2001) or the use of the thematic hierarchy of a text (Nakao, 2000), also take longer texts into consideration. Discourse parsing systems combine lexical, syntactic and semantic features to generate representative discourse trees. But discourse parsing is often intended to work on large texts, characterized by complex analyses, which normally come at the price of expensive processing time. Different theories present various types of discourse structure representations, such as trees (Mann and Thompson, 1988) and graphs (Asher and Lascarides, 2003).

Discourse theories divide the text into spans which are connected through different types of relations. Rhetorical Structure Theory (RST) and Segmented Discourse Representation Theory (SDRT) (Asher, 1993) postulate possible asymmetries of the relation arguments: when two text spans are in a certain relation with respect to each other, one can play a “subordinate” (less important) role relative to the other. This asymmetry is expressed in RST as a distinction between nuclei and satellites and in SDRT as a distinction between coordinating and subordinating relation links (Danlos, 2008).

Another model of discourse structure, Veins Theory (VT) (Cristea et al., 1998a), places the nuclearity of relations at its centre, revealing hidden structures on discourse trees, called veins, which emphasise the manifestation of cohesion and coherence properties of discourses. According to VT, a discourse tree correctly characterizes a text if two criteria are maximally realized on the set of veins corresponding to the discourse units:
- a cohesion criterion, which computes a score associated with the resolution of anaphors on antecedents placed on veins;
- a coherence criterion, which computes a global smoothness score of a discourse by summing up Centering transition scores (Grosz et al., 1995) in which the utterances (discourse units) are ordered hierarchically, i.e. along veins, not linearly.

The incremental parsing technique described in (Anechitei et al., 2013), rooted in an approach introduced in (Cristea and Webber, 1997) and (Cristea et al., 1998a), uses these two criteria to guide a beam search process in a space of partially developed discourse trees. At each step, the parser retains the most promising N trees among those obtained after adjoining an auxiliary tree on the right frontier of the developing structure, where N is determined by the space-speed limitations of the machine accommodating the parser.

During the development of the discourse tree, two principles are consistently observed at each step of the incremental process:
a) the Sequentiality Principle (Marcu, 2000);
b) the Right Frontier Constraint, stated empirically by many scholars, such as Webber (1991) and Afantenos and Asher (2010).

Our approach applies to binary trees and takes into account the right frontier constraint (RFC) (Webber, 1991; Cristea, 2005). As was demonstrated in (Afantenos and Asher, 2010), the RFC can be formulated for SDRT, which makes our model extensible to discourse parsing systems that operate on SDRT representations as well. In this paper we show how a set of formulas rooted in Centering Theory (CT) (Grosz et al., 1995) and VT helps to maximize the coherence of a discourse tree. We present the results of applying our method on a discourse parser system that generates binary trees in which leaves are elementary discourse units (edus), such as clauses or short sentences, and internal nodes represent larger text spans. RST uses a labeling function that attaches relation names and nuclearities to its inner nodes, while VT ignores names of relations. Because of this simplified representation, one can say that VT is included in RST (Mitocariu et al., 2013).

In the next sections we briefly present a discourse parser and both theories (CT and VT). Then we focus on describing the set of formulas and how they can be applied to plan coherent discourse structures while also reducing the search space of attaching nodes on the right frontier (RF). Finally, we analyze the results and draw some conclusions.
2 Discourse Parsing

Discourse structures have a central role in several computational tasks, such as summarization, question-answering (QA), information extraction (IE), etc. Discourse parser systems are developed taking into account different features. Some are based on semantic discourse properties and others use syntactic characteristics of texts. Usually, discourse parsing systems combine these features to generate representative discourse trees, which, among other things, can serve as the basis of approaches aiming at summarizing texts. We believe that a good summary extracted from a discourse structure is one that, besides giving a shorter overview of the text, preserves the qualities of being cohesive and coherent. This is why the use of referential expressions is of primary interest for obtaining coherent discourse structures.

2.1 Centering

Centering Theory (CT) is one of the most influential theories in explaining coherence properties of discourses. It estimates coherence between two adjacent utterances by placing transitions on a scale of five layers, from the easiest to interpret (CONTINUING) to the most difficult (NO CB). The classification of transitions into five types rests on the notion of centers (semantic representations of referential expressions) and their sharing between adjacent utterances. Each utterance (discourse unit) sets a list of forward-looking centers, Cf(Un), as the centers realized in the current utterance. For each discourse unit other than the initial one, a backward-looking center Cb(Un) can be determined, as the first center of Cf(Un) which exists also in Cf(Un-1). This definition also allows for a lack of Cb, when the two consecutive utterances do not share a common center. From the elements of the Cf list, the highest-ranked member is called the preferred center Cp(Un). The five different types of transitions between pairs of successive discourse units (Un, Un+1) are the following:
- CONTINUING (score 4): Cb(Un+1) = Cb(Un) OR Cb(Un) = NULL; Cb(Un+1) = Cp(Un+1)
- RETAINING (score 3): Cb(Un+1) = Cb(Un) OR Cb(Un) = NULL; Cb(Un+1) ≠ Cp(Un+1)
- SMOOTH SHIFTING (score 2): Cb(Un+1) ≠ Cb(Un); Cb(Un+1) = Cp(Un+1)
- ABRUPT SHIFTING (score 1): Cb(Un+1) ≠ Cb(Un); Cb(Un+1) ≠ Cp(Un+1)
- NO Cb (score 0)

By averaging the transitions over the whole discourse, a global Centering score is obtained (Figure 1) (Cristea et al., 1998a), which reflects the coherence of the text. Here, by TScore we denote the transition score between each two consecutive utterances.

Figure 1: General Centering Score

2.2 Veins Theory

Veins Theory makes two important claims: the first regards discourse cohesion, the second coherence. VT extends CT from a local to a global level. It takes from RST the binary tree representations of discourse structures and the notions of nucleus and satellite, but leaves out the names of relations.

The leaves of the discourse tree represent elementary discourse units and the internal nodes, including the root, represent larger spans of text. When two nodes have a common parent, it means that they are in an anonymous discourse relation, and in this relation each has a nuclear role (N) or a satellite role (S). A nucleus is more important than a satellite: if a nucleus were eliminated, the text would lose coherence, but if a satellite were eliminated, the text would lose some details while its coherence would remain unaffected. The material nodes (elementary discourse units, leaves) are supposed to be identified by individual labels. VT introduces two expressions (which represent sequences of material nodes), called head and vein, computed as follows.

The head expression of a node is meant to identify the sequence of the most salient material nodes in the span covered by that node. Head expressions are computed bottom-up:
─ if the node is a leaf, its head expression is its label;
─ else, the head expression is the concatenation of the head expressions of its nuclear children.

The vein expression of a node n is meant to signify the sequence of elementary discourse units which are sufficient to understand the span covered by the node n in the context of the whole discourse. In the definition of vein expressions the following functions, taking as arguments sequences of labels, are used:
─ seq returns the reordering, in left-to-right order, of the sequence given by the concatenation of its arguments;
─ mark returns the same symbols as in its argument, but marked in some way (for example, between parentheses or primed);
─ simpl eliminates all marked symbols from its argument.

With these, vein expressions of all nodes in the discourse tree are computed top-down, as follows (Cristea et al., 1998a):
─ the vein expression of the root node is its head expression;
─ if the node is a nucleus and its parent’s vein expression is v, then:
  • if the node has a left satellite sibling with head h, then its vein expression is seq(mark(h), v);
  • else, v;
─ if the node is a satellite with the head h and its parent’s vein expression is v, then:
  • if it is a left daughter, then its vein expression is seq(h, v);
  • else its vein expression is seq(h, simpl(v)).
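The head and vein rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Node` class, the helper `show`, the encoding of a marked label as a `(unit, True)` pair, and the four-leaf tree itself are all hypothetical. When a unit reaches `seq` both marked and unmarked, the sketch keeps the unmarked copy, which is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vein = List[Tuple[int, bool]]  # (unit label, marked?)


@dataclass
class Node:
    nuclear: bool                    # True = nucleus, False = satellite
    label: Optional[int] = None      # set on leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    vein: Optional[Vein] = None      # filled in by compute_veins


def seq(*parts: Vein) -> Vein:
    """Concatenate and reorder left to right; an unmarked copy wins (assumption)."""
    merged = {}
    for part in parts:
        for unit, marked in part:
            merged[unit] = marked and merged.get(unit, True)
    return sorted(merged.items())


def mark(v: Vein) -> Vein:
    return [(unit, True) for unit, _ in v]


def simpl(v: Vein) -> Vein:
    return [(unit, marked) for unit, marked in v if not marked]


def head(node: Node) -> Vein:
    """Bottom-up: a leaf's own label, else the heads of its nuclear children."""
    if node.label is not None:
        return [(node.label, False)]
    return [u for child in (node.left, node.right) if child.nuclear for u in head(child)]


def compute_veins(node: Node, vein: Vein) -> None:
    """Top-down: store `vein` on the node, then derive the children's veins."""
    node.vein = vein
    if node.label is not None:
        return
    left, right = node.left, node.right
    # Left child: a nucleus inherits v; a satellite (left daughter) gets seq(h, v).
    compute_veins(left, vein if left.nuclear else seq(head(left), vein))
    if right.nuclear:
        # A nucleus with a left satellite sibling adds that sibling's marked head.
        right_vein = vein if left.nuclear else seq(mark(head(left)), vein)
    else:
        # A satellite that is a right daughter gets seq(h, simpl(v)).
        right_vein = seq(head(right), simpl(vein))
    compute_veins(right, right_vein)


def show(v: Vein) -> str:
    return " ".join(f"({u})" if m else str(u) for u, m in v)


# Hypothetical tree: ((1:S 2:N):N 3:S):N 4:N
u1, u2, u3, u4 = (Node(n, label=i) for i, n in enumerate([False, True, False, True], start=1))
inner = Node(True, left=Node(True, left=u1, right=u2), right=u3)
root = Node(True, left=inner, right=u4)
compute_veins(root, head(root))
for leaf in (u1, u2, u3, u4):
    print(leaf.label, "->", show(leaf.vein))
```

For this tree the sketch prints the veins `1 2 4`, `(1) 2 4`, `2 3 4` and `2 4` for units 1 to 4. Note that unit 3 does not appear on the vein of unit 4, because unit 3 is a satellite that contributes only to its own vein.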
2.3 Methodology

The method we describe in this paper was applied on a discourse parsing system that runs on multiple languages (Bulgarian, German, Greek, English, Romanian and Polish) and produces summaries (Anechitei et al., 2013). The system processes the text in the following consecutive steps: sentence splitting, tokenization, part-of-speech tagging, lemmatization, noun phrase extraction, named entity recognition, anaphora resolution, clause splitting and discourse parsing. Since it summarizes thousands of documents per day, there is a pressing need to improve its efficiency.

The discourse parser applies an incremental strategy in developing the trees, at each step observing the principle of sequentiality (Marcu, 2000) and the RFC (Webber, 1991). The incremental development of a discourse tree is performed by repeatedly applying two operations inspired by Tree Adjoining Grammars (Joshi and Schabes, 1997): adjunction and substitution. According to Cristea and Webber (1997), out of the two, only adjunction allows for more options at each step (all the positions of the generalized right frontier), while the substitution operation is always performed at a well determined node (the inner-most substitution node). The adjunction operation, sketched in Figure 2, involves an initial/developing tree (D-tree(i-1)) and an auxiliary tree (A-tree): it replaces the foot node of the auxiliary tree with the tree cropped below the adjunction node of D-tree(i-1) and then inserts the modified A-tree at the adjunction node, thus resulting in a new developing tree (D-tree(i)).

Figure 2: The adjunction operation involves a D-tree and an A-tree and produces a forest of D-trees

In Figure 2, the developing tree (D-tree(i-1)) represents the discourse structure of the already analyzed text and the A-tree represents the discourse structure of the subtext processed in one step (an elementary discourse unit or a small discourse tree representing a sentence).

After each adjunction operation, potentially applied onto each node of the right frontier, a forest of developing trees is obtained. This leads to an exponential explosion of the developing structure, which can be mastered by ranking the trees on a global score associated with each tree, and using for the next step only the best placed ones (a kind of beam search).

In the research reported here we are concerned with cutting down the computational complexity of the search during a VT-guided discourse parsing process. We show that at certain steps during the incremental process, when sufficient information exists, it is possible to keep open for adjunction only a subset of the right frontier, thus drastically reducing the explosion. This is done by focusing the adjunction on those nodes which maximize the chance to solve referential links on veins, exploiting also details of the cropped and the auxiliary trees.

2.4 Centering on veins (VT score)

As shown in Section 2.1, scores can be associated with Centering transitions, which makes it possible to quantify the coherence of a text. In Centering, transitions are computed on pairs of adjacent units within the borders of each segment of the discourse. VT argues that this computation can be generalized to the whole discourse by summing up the CT transition scores on domains of referential accessibility (DRA). The DRA of a unit u is given by the units in the vein expression of u that precede u, and these units are not necessarily adjacent any more. In Example 1, with its corresponding discourse tree structure represented in Figure 3, the centering score on veins is computed differently from the centering score proposed in (Grosz et al., 1995).

Example 1:
1 As John came nearer,
2 he saw that the two men were his brothers,
3 who came from far away,
4 and he said 'I am happy to see you.'

Centering is defined as a local theory of discourse structure, which makes it applicable only inside text segments. If the declared borders of CT were forced and scores were computed also over segment boundaries, no significant transition scores would be added to the overall score, because at most segment boundaries no Cb could be computed (thus, the transition score equals zero). This is the very definition of segment borders. As such, summing up the scores of all segments or totaling the overall score of the discourse as if it belonged to just one large segment would make practically no difference. If a unit has a predecessor in classical Centering, immediately to its left, in VT the predecessor is placed on the unit’s DRA, therefore on the vein of some unit. Vein expressions, and hence DRAs, can skip segment borders as defined in CT. As such, a text of N units in length adds to the overall score in VT the same number of transitions as in CT. Computation of CT scores could be extended to the whole discourse and a comparison could be drawn between the global extended CT score and the global VT score. VT claims to be a global theory, because the limits of small segments are no longer significant. Both theories employ the same five transition types presented in Section 2.1, but VT claims that transitions considered over veins are consistently smoother.
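The difference between the two scores can be made concrete with a small sketch for Example 1. It is hedged in several ways. The function names are hypothetical. The Cf lists and the predecessor pairs, (1, 2), (2, 3), (3, 4) for CT and (1, 2), (2, 3), (2, 4) for VT, are the ones the paper reports for Figure 3. The global score is taken to be the sum of transition scores divided by the number of transitions, which is inferred from the paper's worked averages because the formula in Figure 1 did not survive extraction. The first unit is given no Cb, which falls under the `Cb(Un) = NULL` branch of the rules.

```python
from typing import Dict, List, Optional

SCORES = {"CONTINUING": 4, "RETAINING": 3, "SMOOTH SHIFTING": 2,
          "ABRUPT SHIFTING": 1, "NO CB": 0}


def backward_center(cf_cur: List[str], cf_prev: List[str]) -> Optional[str]:
    """First center of Cf(Un) that also occurs in Cf of the previous unit."""
    for center in cf_cur:
        if center in cf_prev:
            return center
    return None


def transition(cb_prev: Optional[str], cb_cur: Optional[str], cp_cur: str) -> str:
    """Classify a transition with the five-way scheme of Section 2.1."""
    if cb_cur is None:
        return "NO CB"
    if cb_prev is None or cb_cur == cb_prev:
        return "CONTINUING" if cb_cur == cp_cur else "RETAINING"
    return "SMOOTH SHIFTING" if cb_cur == cp_cur else "ABRUPT SHIFTING"


def global_score(cf: Dict[int, List[str]], prev: Dict[int, Optional[int]]) -> float:
    """Average transition score; `prev` maps a unit to its predecessor (None = first)."""
    cb: Dict[int, Optional[str]] = {}
    total, count = 0, 0
    for unit in sorted(cf):
        p = prev[unit]
        if p is None:
            cb[unit] = None          # the initial unit has no backward-looking center
            continue
        cb[unit] = backward_center(cf[unit], cf[p])
        cp = cf[unit][0]             # preferred center = highest-ranked Cf element
        total += SCORES[transition(cb[p], cb[unit], cp)]
        count += 1
    return total / count


# Ranked Cf lists for Example 1, as given in the paper.
cf = {1: ["John"], 2: ["John", "brothers"], 3: ["brothers"], 4: ["John", "brothers"]}
ct_prev = {1: None, 2: 1, 3: 2, 4: 3}   # linear order (classical Centering)
vt_prev = {1: None, 2: 1, 3: 2, 4: 2}   # predecessors along veins (DRA)

print(global_score(cf, ct_prev))  # 3.0
print(global_score(cf, vt_prev))  # 3.33...
```

Only the predecessor map changes between the two calls. The classifier and the Cf lists are shared, which mirrors the claim that both theories use the same five transition types.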
Figure 3: The CT and VT scores (H = head expressions, V = vein expressions)

The outlined nodes are nuclei and the others are satellites. As depicted in Figure 3, the CT score is computed in the following order: (1, 2), (2, 3), (3, 4):

Cf(U1) = {[John]};
Cb(U1) = [John];

Cf(U2) = {he=[John], [the two brothers]};
Cb(U2) = [John];
Cp(U2) = [John];
Transition (1, 2) = CONTINUING; score = 4;

Cf(U3) = {who=[the two brothers]};
Cb(U3) = [the two brothers];
Cp(U3) = [the two brothers];
Transition (2, 3) = SMOOTH SHIFTING; score = 2;

Cf(U4) = {he=I=[John], you=[the two brothers]};
Cb(U4) = [the two brothers];
Cp(U4) = [John];
Transition (3, 4) = RETAINING; score = 3.

This leads to an average global CT score of (4+2+3)/3 = 3.

The VT score is computed in a different manner, given by the vein expressions and DRAs as described above. The transitions are: (1, 2), (2, 3), (2, 4). The difference from the previous computation lies in the last pair: (2, 4) instead of (3, 4). This triggers a different Cb, because now the unit “previous” to unit 4 is unit 2 and not unit 3:

Cb(U4) = [John];
Transition (2, 4) = CONTINUING; score = 4.

Thus, the average score is (4+2+4)/3 = 3.33, greater than in the CT case, indicating a smoother discourse.

3 The method

In the process of building discourse trees, great importance attaches to the relationship between reference chains and the discourse structure (a manifestation of cohesion) on the one hand and, on the other hand, between reference chains and the smoothness of centering transitions (a manifestation of coherence) (Cristea et al., 2005). We consider Veins Theory a necessary step for observing the link between referential expressions from the incoming text segment and the whole discourse. A set of formulas deduced from VT rules helps to guess the most promising nodes of the right frontier where an adjunction can be made. The vein expression of a node of the right frontier allows us to predict how it will change if an adjunction operation is performed on it, knowing only the nuclearity of the adjoining node, the nuclearity configuration below the A-tree root node (N S, N N or S N) and the referential chains that link the material node of the A-tree to the previous discourse.

With this information in hand, those nodes of the right frontier which maximize a function of referentiality can be computed. This function counts the number of referential expressions belonging to the material node of the A-tree whose coreference chains intersect the vein expression after the adjunction on a certain node of the right frontier of the D-tree. Then, by taking the decision to adjoin the A-tree onto one of these points, we adhere to a greedy strategy, assuming that the best choice made now will maximize the probability for the D-tree to further evolve into the most cohesive and coherent structure. Using this information, the best nodes of the right frontier where the adjunction should be made can be predicted. Thus, the search space for the attachment nodes is reduced, because only the nodes that contain referential expressions are targeted. Also, the coherence of the discourse structure is maximized, because the vein expressions of the specific nodes will append the best fitting labels. This implies a maximization of the VT score.

3.1 Detailed description of the set of formulas

The set of formulas derives from VT and predicts what changes appear in the vein expressions of the D-tree after an adjunction is made. If this is done without computing a whole tree for each of the adjunction positions on the RF of the developing tree, then a lot of computation is saved. The prediction takes into consideration the nuclearity of the nodes below an adjunction node as well as of the nodes below the root node of the A-tree. Since after each adjunction only a few nodes of the developing tree are actually modified, it is advantageous to maximize the combined score by comparing only the selected veins that differ from one tree of the adjunction forest to another. The optimum adjunction node on the RF can thus be chosen, which also reduces the computation time.

Figure 4: Description of the trees after an adjunction operation involved in the set of formulas

In Figure 4 the root of the Partial tree (P-tree) is the adjunction node belonging to the right frontier of the D-tree and the Material tree (M-tree) is the right child of the root of the A-tree (the former right sibling of the foot node).

In formulating the rules, the following notations are used:
- the vein expression of a node of P-tree is VP;
- the vein expression of a node of M-tree is VM;
- the vein expression of a node of D-tree is VD;
- the vein expression of P-tree root node is VPr;
- the vein expression of M-tree root node is VMr;
- the vein expression of A-tree root node is VAr;
- the head expression of a node of P-tree is HP;
- the head expression of a node of M-tree is HM;
- the head expression of a node of D-tree is HD;
- the head expression of P-tree root node is HPr;
- the head expression of M-tree root node is HMr;
- the head expression of A-tree root node is HAr.

Analyzing how the resulting tree is changed, nine cases were discovered, determined by two factors: the presence of referential links between the incoming text and the initial discourse and the type (1) of the A-tree root node (N S, S N, or N N). They are presented below.

(1) By type, here and below, we mean the nuclearity configuration of the children in left-to-right order.

1) If the type of the A-tree root node is N S and adjunction is made in the root of the D-tree, then the following things happen:
- VAr = VPr;
- HAr = HPr;

2) If the type of the A-tree root node is N S and adjunction is made in a nuclear node, then the following things happen:
- VAr = VPr;
- HAr = HPr;
- VM = seq(VM, simpl(VPr));

3) If the type of the A-tree root node is N S and adjunction is made in a satellite node, then the following things happen:
- VAr = VPr;
- HAr = HPr;
- VM = seq(VM, VPr);

4) If the type of the A-tree root node is S N and adjunction is made in the root of the D-tree, then the following changes happen:
- VM = seq(simpl(VM), mark(HPr)). If the update of the M-tree is done bottom-up on the right frontier, then the process stops after the first satellite is met;
- VP = seq(VP, VAr);
- VPr = seq(HPr, HAr);

5) If the type of the A-tree root node is S N and adjunction is made in a nuclear node, then the following things happen:
- The head of the P-tree root is marked;
- VD = seq(VD, HAr, simpl(HPr)). If the update of the D-tree is done bottom-up on the right frontier, then the process stops after the first satellite is met;
- VAr = seq(VAr, seq(VPr, simpl(HPr)));
- VP = seq(VP, HAr);
- VM = seq(VM, seq(mark(HPr), VPr));

6) If the type of the A-tree root node is S N and adjunction is made in a satellite node, then the following things happen:
- VP = seq(VP, VAr);
- VM = seq(VM, seq(mark(HPr), VPr));
- The head of the P-tree root is marked;
- VAr = seq(VAr, seq(VPr, simpl(HPr)));

7) If the type of the A-tree root node is N N and adjunction is made in the root node, then the following things happen:
- The new P-tree will be the previous D-tree and it will copy all previous vein and head expressions;
- VM = seq(VM, VPr);
- VP = seq(VP, VAr);

8) If the type of the A-tree root node is N N and adjunction is made in a nuclear node, then the following things happen:
- HAr = seq(HAr, HPr);
- VAr = seq(VAr, VPr);
- VD = seq(VD, VAr). If the update of the D-tree is done bottom-up on the right frontier, then the process stops after the first satellite is met;
- VP = seq(VP, VAr);

9) If the type of the A-tree root node is N N and adjunction is made in a satellite node, then the following things happen:
- HAr = seq(HAr, HPr);
- VAr = seq(VAr, VPr);
- VP = seq(VP, VAr).
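To show what "predicting without recomputing" means, the following sketch applies the VM formulas of cases 2 and 3 to a tiny configuration. It rests on assumptions: the function name `material_vein_ns` is hypothetical, marked labels are encoded as `(unit, True)` pairs, and the material node is assumed to start with a vein consisting only of its own label. The D-tree is the hypothetical two-leaf tree (1:S 2:N), in which unit 2 is a nuclear right-frontier node with vein `(1) 2`.

```python
from typing import List, Tuple

Vein = List[Tuple[int, bool]]  # (unit label, marked?)


def seq(*parts: Vein) -> Vein:
    """Concatenate and reorder left to right; an unmarked copy wins (assumption)."""
    merged = {}
    for part in parts:
        for unit, marked in part:
            merged[unit] = marked and merged.get(unit, True)
    return sorted(merged.items())


def simpl(v: Vein) -> Vein:
    return [(unit, marked) for unit, marked in v if not marked]


def material_vein_ns(site: str, v_m: Vein, v_pr: Vein) -> Vein:
    """VM after adjoining an A-tree of type N S (cases 2 and 3 above)."""
    if site == "nucleus":
        return seq(v_m, simpl(v_pr))   # case 2
    if site == "satellite":
        return seq(v_m, v_pr)          # case 3
    raise ValueError("case 1 (root) lists no VM update")


v_pr = [(1, True), (2, False)]   # vein "(1) 2" of the adjunction node (unit 2)
v_m = [(3, False)]               # material node: unit 3, assumed initial vein "3"
print(material_vein_ns("nucleus", v_m, v_pr))  # [(2, False), (3, False)]
```

For this configuration, recomputing every vein with the top-down rules of Section 2.2 on the adjoined tree (1:S (2:N 3:S):N) gives the same vein `2 3` for unit 3, so the update touched a single expression instead of the whole tree.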
3.2 Exemplifying how the set of formulas is used

To understand how the set of formulas is applied, let us consider the text of Example 2, already segmented into seven units:

Example 2:
1 Makaha changes its Name
2 Makaha Inc. said:
3 the CEO has decided that the new name will be TerroCom.
4 In a new release, the company said
5 the new name more accurately reflects focus on high-technology communications,
6 including business and entertainment software, interactive media and wireless data and voice transmission.
7 He decided to make this change starting with tomorrow.

The demonstration that follows is built on the supposition that the first six clauses of the text in Example 2 are already analyzed and the incremental process has reached the point where unit no. 7 has to be adjoined to the right frontier of the developing tree. Let us note that this adjoining operation will trigger modifications of the vein expressions of some (or all) of the nodes of the terminal frontier. This suggests the idea that is at the core of our proposal: find that node of the right frontier where the adjunction of a new material node containing the current unit will prolong the vein expressions of the terminal nodes in the most profitable way.

The first step is to identify which units of the previous discourse contain antecedents for the anaphors contained in the current node. For instance, in Example 2, we want to add label 3 to the vein expression of unit 7 because both units 3 and 7 include references to the same entity. This way the coherence of the text is kept high, since on the argumentation line of unit 7 there will be a highly scored transition (most probably CONTINUING or RETAINING, according to Centering (Grosz et al., 1995) and VT (Cristea et al., 1998a)).

Figure 5: Discourse tree representation of the text in Example 2

In Figure 5 the arrow points to the node where the adjunction must be made. Head (H) and vein (V) expressions are marked on each node. The selection was made because the attachment node is situated on the right frontier, it covers the node representing clause 3 and it is the nearest to it. The tree resulting after adjunction is presented in Figure 6. As can be seen, the label of node 3 is placed on the vein expression of the node representing unit 7. The set of formulas derived from VT shows that if the A-tree root node is labeled N N and the adjoining node is nuclear (N), all the vein expressions contained in the D-tree are kept unchanged while, at the same time, new information is added. This is easily observed from the set of vein expressions. Thus, when the VT score is computed for this example, it will take into account the transition between units (3, 7), which is what we intended.

Figure 6: Resulting discourse tree after the adjunction operation

Based on the set of formulas and on the analysis of how the resulting tree is changed, the best node of the RF where the adjunction takes place can be selected by considering these cases:

There is no referential link between a unit in the D-tree and a unit in the A-tree:
- If the A-tree root node is typed S N or N N: adjunction should be made in a satellite (S) node (an adjunction in a nucleus (N) would entail the deletion of the head expression of the cropped tree);
- If the A-tree root node is typed N S: do adjunction in either an S or an N (preferably in an S, because it keeps the marked nodes).

There is at least one referential link between a unit in the D-tree and a unit in the A-tree:
- If the A-tree root node is typed S N or N N: do adjunction in a nuclear node, which will cover all the referents (the nuclear node for adjunction must be the one which contains most of the referential expressions);
- If the A-tree root node is typed N S: do adjunction in a satellite node (the marked nodes will not be lost and therefore the coherence of the text is preserved through maintaining the referential expressions).
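The four selection cases above amount to a filter over the right frontier, which the following sketch spells out. Everything named here is hypothetical: the `FrontierNode` record, the function `candidate_sites`, and the example frontier, which is not the tree of Figure 5. Two further assumptions are made. The root is treated as a nuclear node, and ties between nuclear nodes covering the same number of antecedent units are broken in favour of the smallest span, following the remark that the chosen node "covers the node representing clause 3 and is the nearest to it".

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class FrontierNode:
    name: str
    nuclear: bool            # the root is treated as nuclear here (assumption)
    units: FrozenSet[int]    # elementary units covered by the node


def candidate_sites(a_type: str, frontier: List[FrontierNode],
                    antecedents: Set[int]) -> List[FrontierNode]:
    """Right-frontier nodes kept open for adjunction of an A-tree of type a_type.

    a_type is "N S", "S N" or "N N"; antecedents holds the D-tree units that
    contain antecedents for anaphors of the incoming unit (empty = no link).
    """
    satellites = [n for n in frontier if not n.nuclear]
    nuclei = [n for n in frontier if n.nuclear]
    if not antecedents:
        # No referential link: satellites; N S may fall back to any node.
        if satellites or a_type != "N S":
            return satellites
        return list(frontier)
    if a_type == "N S":
        return satellites
    # S N or N N with a link: the nuclear node covering most antecedent units,
    # ties broken by the smallest span (assumed tie-break).
    covering = [n for n in nuclei if n.units & antecedents]
    if not covering:
        return []
    best = max(len(n.units & antecedents) for n in covering)
    top = [n for n in covering if len(n.units & antecedents) == best]
    smallest = min(len(n.units) for n in top)
    return [n for n in top if len(n.units) == smallest]


# Hypothetical right frontier of a six-unit D-tree.
frontier = [
    FrontierNode("root", True, frozenset(range(1, 7))),
    FrontierNode("n3_6", True, frozenset({3, 4, 5, 6})),
    FrontierNode("n4_6", False, frozenset({4, 5, 6})),
    FrontierNode("u6", False, frozenset({6})),
]
print([n.name for n in candidate_sites("N N", frontier, {3})])    # ['n3_6']
print([n.name for n in candidate_sites("S N", frontier, set())])  # ['n4_6', 'u6']
```

With a link to unit 3 and an A-tree of type N N, a single node stays open instead of the four on the frontier. This is the kind of pruning the method relies on to reduce the number of adjunction operations.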
4 Evaluation

To evaluate our method we used the English part of the corpus mentioned in the original VT paper (Cristea et al., 1998b), texts distributed by the Message Understanding Conference (MUC-7). This corpus includes 30 newspaper texts whose lengths vary widely (average of 408 words and standard deviation of 376 words) and which are manually annotated for co-reference relations (Hirschman and Chinchor, 1997) and complemented with RST structures by Marcu et al. (1999). In Table 1 we present the results obtained by comparing two different methods. The first one explores all the nodes belonging to the right frontier and the second one uses the method presented in this paper. Comparisons were made taking into consideration two factors:
- the coherence of the discourse trees;
- the structure of the discourse trees.

The coherence of the discourse tree was evaluated using the method proposed in Section 2.4. The discourse tree structures were compared using the measures proposed in Mitocariu et al. (2013). The significance of the labels in the table is the following:
- ADJ-: the number of adjunction operations with the optimization feature switched off;
- ADJ+: the number of adjunction operations with the optimization feature switched on;
- CT: the classical Centering score;
- VT-: the Centering score on veins with the optimization feature switched off;
- VT+: the Centering score on veins with the optimization feature switched on;
- OS, NS, VS: scores for comparing discourse tree structures in terms of coverage, nuclearity and vein expressions (Mitocariu et al., 2013).

Filename  ADJ-  ADJ+  CT     VT-    VT+    OS     NS     VS
MUC1      169   83    0.121  0.121  0.121  0.939  0.909  0.972
MUC2      27    15    0.286  0.429  0.429  1      1      1
MUC5      21    12    0      0      0      1      1      1
MUC7      33    18    0      0      0      1      1      1
MUC9      57    30    0.304  0.348  0.348  1      1      1
MUC10     51    27    0.091  0.091  0.091  1      1      1
MUC11     171   84    0.096  0.096  0.115  0.962  0.942  0.973
MUC13     21    12    0.333  0.5    0.5    1      1      1
MUC14     75    39    0.1    0.1    0.1    1      1      1
MUC16     69    36    0.588  0.647  0.647  1      1      1
MUC17     9     6     0.6    0.6    0.6    1      1      1
MUC18     30    15    0.667  0.75   0.667  0.917  0.917  0.984
MUC20     27    15    0      0      0      1      1      1
MUC22     273   132   0.11   0.11   0.134  0.951  0.927  0.962

Table 1: Comparing the two approaches

The figures in the table above show that, when applying the set of formulas, the results of the discourse parsing system are similar or almost identical to those of the system running without the set of formulas. The results are important because they show a significant reduction of the computational effort.

Analyzing the fields ADJ- and ADJ+, it can be noticed that the number of adjunction operations in classical incremental discourse parsing is larger than the number of adjunction operations performed when applying the set of formulas. From these results it is easy to conclude that the execution time is reduced approximately by half. For example, for the text MUC1 the number of operations for ADJ- is 169 and for ADJ+ it is 83. Decreasing the number of operations triggers the reduction of the execution time.

The second important finding is that the structures obtained by applying the reduction strategy have a similar quality to those obtained using classical incremental parsing. Almost identical discourse structures are obtained with the feature switched on as with it switched off (most of the comparison scores are 1 or very close to 1). Thus, the economy in running effort does not negatively affect the coherence of the obtained structures.

5 Conclusions

We have proposed a set of formulas which may be used by incremental discourse parsing systems to reduce the number of adjunction operations on the right frontier, and we have demonstrated how these formulas help to maintain the coherence of the text and to reduce the complexity of the computations. The incremental evaluation of tree structures is based on Veins Theory. We make use of referential expressions, the nuclearity of the adjunction nodes and the type of the auxiliary tree to select the best node for the adjunction operation. From the perspective of an incremental discourse parsing system, this set of formulas is useful in processing long texts, where it helps to reduce the search space of adjoining on the right frontier.

6 References

Stergos D. Afantenos and Nicholas Asher (2010). Testing SDRT’s Right Frontier. In Proceedings of COLING, pp. 1–9.
Daniel A. Anechitei, Dan Cristea, Ioannidis Dimosthenis, Eugen Ignat, Diman Karagiozov, Svetla Koeva, Mateusz Kopeć and Cristina Vertan (2013). Summarizing Short Texts Through a Discourse-Centered Approach in a Multilingual Context. In Neustein, A., Markowitz, J.A. (eds.), Where Humans Meet Machines: Innovative Solutions to Knotty Natural Language Problems. Springer Verlag, Heidelberg/New York.
Nicholas Asher (1993). Reference to Abstract Objects in Discourse. Dordrecht: Kluwer Academic Publishers.
Nicholas Asher and Alex Lascarides (2003). Logics of Conversation. Cambridge University Press.
Branimir K. Boguraev and Mary S. Neff (2000). Lexical cohesion, discourse segmentation and document summarization. In Proceedings of RIAO.
Dan Cristea, Oana Postolache, Ionut Pistol (2005). Summarisation through Discourse Structure. In the 6th International Conference CICLing, Mexico City, Mexico, pp. 632–644.
Dan Cristea (2005). The Right Frontier Constraint Holds Unconditionally. In Proceedings of the Multidisciplinary Approaches to Discourse (MAD'05), Chorin/Berlin, Germany.
Dan Cristea, Nancy Ide, Laurent Romary (1998a). Veins theory: A model of global discourse cohesion and coherence. In Proceedings of the 17th International Conference on Computational Linguistics, pp. 281–285, Montreal.
Dan Cristea, Nancy Ide, Laurent Romary (1998b). Marking-up Multiple Views Of A Text: Discourse And Reference. In Proceedings of the First International Conference on Language Resources and Evaluation, Granada.
Dan Cristea and Bonnie L. Webber (1997). Expectations in Incremental Discourse Processing. In Philip R. Cohen, Wolfgang Wahlster (eds.), Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, Madrid.
Laurence Danlos (2008). Strong generative capacity of RST, SDRT and discourse dependency DAGs. In Benz, A. and P. Kühnlein, editors, Constraints in Discourse, Pragmatics and Beyond New Series, pp. 69–95.
Barbara J. Grosz, Aravind K. Joshi and Scott Weinstein (1995). Centering: A framework for modeling the local coherence of discourse. Computational Linguistics, 21(2), pp. 203–226.
Lynette Hirschman and Nancy Chinchor (1997). MUC-7 Coreference task definition. In MUC-7 Proceedings. Science Applications International Corporation.
Aravind K. Joshi and Yves Schabes (1997). Tree-Adjoining Grammars. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, pp. 69–123, Springer, Berlin.
William C. Mann and Sandra A. Thompson (1988). Rhetorical structure theory: Toward a functional theory of text organization. TEXT, 8(3), pp. 243–281.
Daniel Marcu (2000). The Theory and Practice of Discourse Parsing and Summarization. The MIT Press, Cambridge, Massachusetts.
Daniel Marcu, Estibaliz Amorrortu and Magdalena Romera (1999). Experiments in Constructing a Corpus of Discourse Trees. In Proceedings of the ACL Workshop on Standards and Tools for Discourse Tagging, College Park, MD, pp. 48–57.
Elena Mitocariu, Daniel A. Anechitei, Dan Cristea (2013). Comparing discourse tree structure. In the 14th International Conference CICLing, Samos, Greece, March 24–30, pp. 513–522.
Yoshio Nakao (2000). An algorithm for one-page summarization of a long text based on thematic hierarchy detection. In Proceedings of the ACL, pp. 302–309.
Bonnie L. Webber (1991). Structure and ostension in the interpretation of discourse deixis. Natural Language and Cognitive Processes, 6(2):107–135.
Masao Utiyama and Hitoshi Isahara (2001). A statistical model for domain-independent text segmentation. In Proceedings of the ACL, pp. 499–506.